02. Gridworld Example


## Quiz

To check your understanding of the environment, please answer the questions below.

What is the size of the set $\mathcal{S}^+$ of all states (including terminal states)?

SOLUTION: 4

What is the size of the set $\mathcal{A}$ of all actions?

SOLUTION: 4
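As a minimal sketch, the two sets from the questions above can be written out in Python. The labels are partly assumed: the quiz names states 1, 2, and 3 and the actions "up", "right", and "down"; a fourth, terminal state labeled 4 and a fourth action "left" are inferred here from the set sizes and are not stated in this section.

```python
# Assumed labels: states 1-3 and actions "up", "right", "down" appear in the quiz;
# terminal state 4 and action "left" are inferred from the set sizes.
NONTERMINAL_STATES = {1, 2, 3}
TERMINAL_STATES = {4}

# S^+ is the set of all states, terminal states included.
ALL_STATES = NONTERMINAL_STATES | TERMINAL_STATES

# A is the set of all actions.
ACTIONS = {"up", "down", "left", "right"}

print(len(ALL_STATES), len(ACTIONS))
```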

The Gridworld Example video states that the discount rate is $\gamma = 1$. With this in mind, which of the following must be true?

SOLUTION: The reward is not discounted.
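To make this concrete, here is a small sketch of how a return is computed from a finite sequence of rewards. With a discount rate of 1, the return is just the plain sum of the rewards. The function name `discounted_return` and the reward sequence are hypothetical, chosen only for illustration.

```python
def discounted_return(rewards, gamma=1.0):
    """Compute sum over k of gamma**k * rewards[k] for a finite reward sequence."""
    g = 0.0
    # Work backwards through the episode: G = r + gamma * G_next.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Hypothetical rewards for one episode (made up for illustration).
rewards = [-1, -1, 5]

undiscounted = discounted_return(rewards, gamma=1.0)  # equals sum(rewards)
discounted = discounted_return(rewards, gamma=0.9)    # later rewards count for less
```

With any discount rate below 1, later rewards contribute less to the return; with a rate of exactly 1, every reward counts at full value no matter when it arrives.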

Suppose that at some time step $t$, the agent is in state 2 and selects action "up". Which of the following are possible (reward, next state) pairs that the agent could receive at time step $t+1$? (Select all that apply.)

SOLUTION:
  • reward: -1 | next state: 2
  • reward: -1 | next state: 1
  • reward: -1 | next state: 3
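Because one action can lead to several different outcomes, the environment's dynamics are a set of possible (reward, next state) pairs per state-action pair rather than a single result. The sketch below records only the entry confirmed by the solution above. The names `POSSIBLE_OUTCOMES` and `is_possible` are hypothetical, and no transition probabilities are included because this section does not give them.

```python
# Maps (state, action) to the set of (reward, next_state) pairs that can follow.
# Only the entry confirmed by the quiz solution is filled in.
POSSIBLE_OUTCOMES = {
    (2, "up"): {(-1, 2), (-1, 1), (-1, 3)},
}

def is_possible(state, action, reward, next_state):
    """Return True if (reward, next_state) is listed for this state-action pair."""
    return (reward, next_state) in POSSIBLE_OUTCOMES.get((state, action), set())

print(is_possible(2, "up", -1, 1))
```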

Which of the following choices describes the optimal policy $\pi_*$?

SOLUTION: The agent should select "up" in state 1, "right" in state 2, and "down" in state 3.
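A deterministic policy like this one is simply a mapping from each non-terminal state to an action. As a minimal sketch (the names `OPTIMAL_POLICY` and `act` are hypothetical):

```python
# Deterministic policy from the solution above: one action per non-terminal state.
OPTIMAL_POLICY = {1: "up", 2: "right", 3: "down"}

def act(state):
    """Look up the action the policy selects in the given state."""
    return OPTIMAL_POLICY[state]

print(act(2))
```

No entry is needed for a terminal state, since the episode ends there and no further action is selected.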